To catch a fake: Curbing deceptive Yelp ratings and venues

Authors

  • Mahmudur Rahman
  • Bogdan Carbunar
  • Jaime Ballesteros
  • Duen Horng Chau
Abstract

The popularity and influence of reviews make sites like Yelp ideal targets for malicious behaviors. We present Marco, a novel system that exploits the unique combination of social, spatial and temporal signals gleaned from Yelp to detect venues whose ratings are impacted by fraudulent reviews. Marco increases the cost and complexity of attacks by imposing a tradeoff on fraudsters between their ability to impact venue ratings and their ability to remain undetected. We contribute a new dataset to the community, which consists of both ground truth and gold standard data. We show that Marco significantly outperforms state-of-the-art approaches, achieving 94% accuracy in classifying reviews as fraudulent or genuine, and 95.8% accuracy in classifying venues as deceptive or legitimate. Marco successfully flagged 244 deceptive venues in our large dataset of 7,435 venues, 270,121 reviews and 195,417 users. Furthermore, we use Marco to evaluate the impact of Yelp events, organized for elite reviewers, on the hosting venues. We collect data from 149 Yelp elite events throughout the US. We show that two weeks after an event, twice as many hosting venues experience a significant rating boost as experience a negative impact. © 2015 Wiley Periodicals, Inc. Statistical Analysis and Data Mining: The ASA Data Science Journal 8: 147–161, 2015


Similar resources

Turning the Tide: Curbing Deceptive Yelp Behaviors

The popularity and influence of reviews make sites like Yelp ideal targets for malicious behaviors. We present Marco, a novel system that exploits the unique combination of social, spatial and temporal signals gleaned from Yelp to detect venues whose ratings are impacted by fraudulent reviews. Marco increases the cost and complexity of attacks by imposing a tradeoff on fraudsters, between th...


Spatial analysis of users-generated ratings of yelp venues

Background: With popular location-based services on smart phones, users are willing to leave comments on the business venues (e.g., restaurants, shops, bars, etc.) that they visited. Reviews of users on Yelp venues somewhat indicate satisfaction of customers with services of those venues. Those reviews could be used to reflect service quality of business venues. Geo-localized venues could tell ...


What Yelp Fake Review Filter Might Be Doing?

Online reviews have become a valuable resource for decision making. However, its usefulness brings forth a curse ‒ deceptive opinion spam. In recent years, fake review detection has attracted significant attention. However, most review sites still do not publicly filter fake reviews. Yelp is an exception which has been filtering reviews over the past few years. However, Yelp’s algorithm is trad...


Spotting Suspicious Reviews via (Quasi-)clique Extraction

How to tell if a review is real or fake? What does the underworld of fraudulent reviewing look like? Detecting suspicious reviews has become a major issue for many online services. We propose the use of a clique-finding approach to discover well-organized suspicious reviewers. From a Yelp dataset with over one million reviews, we construct multiple Reviewer Similarity graphs to link users that ...


Poster: Spotting Suspicious Reviews via (Quasi-)clique Extraction

How to tell if a review is real or fake? What does the underworld of fraudulent reviewing look like? Detecting suspicious reviews has become a major issue for many online services. We propose the use of a clique-finding approach to discover well-organized suspicious reviewers. From a Yelp dataset with over one million reviews, we construct multiple Reviewer Similarity graphs to link users that ...



Journal title:
  • Statistical Analysis and Data Mining

Volume 8

Pages 147–161

Publication date 2015